\subsection{Sets and subsets}
A set is any* collection of mathematical objects.
\((\forall x, x \in A \iff x \in B) \iff (A = B)\).
In words, two sets which have the same members are considered to be the same; order of members is not important in a set.
There is no `multiple membership' of a set, \(\{ a, a \} = \{ a \}\).

Given a set \(A\) and a property \(p(x)\), we can form \(\{ x \in A: p(x) \}\); the subset of all members of \(A\) with property \(p\).
This is sometimes called the `subset selection' rule or axiom.
We can say that \(B\) is a subset of \(A\) if \(\forall x, x \in B \implies x \in A\), written \(B \subseteq A\).
Further, \(A = B \iff A \subseteq B, B \subseteq A\).

\subsection{Composing sets}
Given sets \(A\) and \(B\), we can form their union \(A \cup B = \{ x: x \in A \lor x \in B \}\).
We can also form their intersection \(A \cap B = \{ x: x \in A \wedge x \in B \}\).
If \(A \cap B = \varnothing\), we say \(A\) and \(B\) are disjoint.
Note that we could consider \(A \cap B\) as a special case of subset selection; the subset of \(A\) with the property that the element is in \(B\).
Therefore, \(A \cap B \subseteq A\), and \(A \cap B \subseteq B\).
We define the set difference \(A \setminus B = \{ x \in A: x \notin B \}\).

Note that \(\cap\) and \(\cup\) are commutative and associative.
Also, \(\cup\) is distributive over \(\cap\), and \(\cap\) is distributive over \(\cup\).
For example, let us prove that \(A \cap (B \cup C) = (A \cap B) \cup (A \cap C)\).
\begin{itemize}
	\item (LHS \(\subseteq\) RHS) Given \(x \in A \cap (B \cup C)\), we have \(x \in A\) and also either \(x \in B\) or \(x \in C\).
	      If \(x \in B\) then \(x \in A \cap B\) so \(x \in (A \cap B) \cup (A \cap C)\); and vice versa for \(C\).
	\item (RHS \(\subseteq\) LHS) Given \(x \in (A \cap B) \cup (A \cap C)\), either \(x \in A \cap B\) or \(x \in A \cap C\).
	      If \(x \in A \cap B\) then \(x \in A\) and \(x \in B \cup C\) as required; and vice versa for the other case.
\end{itemize}
As the union is associative, we can have bigger unions of more sets.
For example, if we let \(A_n = \{ n^2, n^3 \}\) for each \(n \in \mathbb N\), the infinite union
\[
	A_1 \cup A_2 \cup A_3 \cup \cdots = \bigcup_{n=1}^\infty A_n = \bigcup_{n \in \mathbb N} A_n = \{ x \in N: x \text{ is a square or a cube} \}
\]
When we use the \(n \in \mathbb N\) on the large union symbol, we call \(\mathbb N\) the `index set'.
Note that the infinite union is not defined as a limit of finite unions; it is simply defined using set comprehension.
In general, given a set \(I\), and sets \(A_i\), \(i \in I\), we can form
\[
	\bigcup_{i \in I}A_i = \{ x: \exists i \in I, x \in A_i \}
\]
and
\[
	\bigcap_{i \in I}A_i = \{ x: \forall i \in I, x \in A_i \}
\]
Note that we cannot form an intersection when \(I = \varnothing\), as will be explained later.

For any \(a, b\), we can form the ordered pair \((a, b)\), where equality is checked component-wise.
For sets \(A, B\), we can form their product \(A \times B = \{ (a, b) : a \in A, b \in B \}\).
For example, \(\mathbb R^2 = \mathbb R \times \mathbb R\) can be viewed as a plane.
We can form other sizes of tuples similarly.

For any set \(X\), we can form the power set \(\mathcal P(X)\) consisting of all subsets of \(X\).
\[
	\mathcal P(X) = \{ Y: Y \subseteq X \}
\]
For example:
\[
	\mathcal P(\{ 1, 2 \}) = \{ \varnothing, \{ 1 \}, \{ 2 \}, \{ 1, 2\} \}
\]

\subsection{Russell's paradox}
For a set \(A\), we can always form the set \(\{ x \in A: p(x) \}\) for any property \(p\).
We cannot, however, form the set \(\{ x: p(x) \}\).
Suppose we could form such a set, then we could form the set \(X = \{ x: x \notin x \}\).
Now, is \(X \in X\)?
If this is true, then it fails the defining property \(x \notin x\).
If this is false, then the defining property is true, so it must be in the set.
This is a contradiction in both cases.

Similarly, there is no `universal' set \(\mathscr E\), meaning \(\forall x, x \in \mathscr E\).
Otherwise we could form the \(X\) above by \(\{ x \in \mathscr E: p(x) \}\).
To guarantee that a given set exists, we need to obtain it in some way from known sets.

\subsection{Finite sets}
We will write \(\mathbb N_0 = \mathbb N \cup \{ 0 \}\).
For \(n \in \mathbb N_0\), we can say that a set \(A\) has size \(n\) if we can write \(A = \{ a_1, a_2, \cdots, a_n \}\) where the \(a_i\) are distinct.
A set is called finite if it has a size \(n \in \mathbb N_0\).

Note that a set cannot have size \(n\) and size \(m\) for \(n \neq m\).
Suppose that \(A\) has size \(n\) and size \(m\) where \(n, m > 0\).
Then, removing an element, we obtain a set that has size \(n-1\) and \(m-1\).
By induction on the larger of \(n\) and \(m\), we will eventually reach a size of both zero and nonzero which is a contradiction.

\begin{proposition}
	A set of size \(n\) has exactly \(2^n\) subsets.
\end{proposition}
\begin{proof}[Proof 1]
	We may assume that our set is simply \(\{ 1, 2, \cdots, n \}\) by relabelling.
	When constructing a subset \(S\) from this set, there are \(n\) independent binary choices for whether a given element should be within this subset, since for example either \(1 \in S\) or \(1 \notin S\) must be true.
	So there are \(2^n\) distinct choices of subset you can make.
\end{proof}
\begin{proof}[Proof 2]
	We will prove this inductively on \(n\), noting that \(n=0\) is trivial.
	For any subset \(T \subseteq \{ 1, 2, \cdots n-1 \}\), how many \(S \subseteq \{ 1, \cdots, n \}\) have \(S \cap \{ 1, 2, \cdots n-1 \} = T\)?
	Exactly two: \(T\) and \(T \cup \{ n \}\).
	So there are two choices for how to extend this subset to the new element \(n\).
	So the number of subsets is \(2 \cdot 2^{n-1} = 2^n\).
\end{proof}
In some sense Proof 2 is a more `formal' version of Proof 1, using induction rather than intuition.
We sometimes say that if \(A\) has size \(n\), then \(\abs{A} = n\), and that \(A\) is an \(n\)-set.

\subsection{Binomial coefficients}
For \(n \in \mathbb N_0\) and \(0 \leq k \leq n\), we can write \(\binom{n}{k}\) for the number of subsets of an \(n\)-set that are of size \(k\).
\[
	\binom{n}{k} = \abs{\left\{ S \subseteq \{ 1, 2, \dots, n \}: \abs{S} = k \right\}}
\]
For example, there are six 2-sets in a 4-set.
There is a formula for this, but generally this definition is a lot easier to use.
Note that \(\binom{n}{0} = 1\), \(\binom{n}{n} = 1\), and \(\binom{n}{1}=n\) where \(n>0\).

Note that \(\binom{n}{0} + \binom{n}{1} + \dots + \binom{n}{n} = 2^n\) as each side counts the number of subsets in an \(n\)-set.
Also:
\begin{enumerate}
	\item \(\binom{n}{k} = \binom{n}{n-k}\) (\(\forall n \in N_0, 0 \leq k \leq n\)).
	      Indeed, specifying which \(k\) members to pick for a subset is equivalent to specifying which \(n-k\) members not to pick.
	\item \(\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}\) (\(\forall n \in \mathbb N, 0 < k < n\)).
	      Indeed, the number of \(k\)-subsets of \(\{ 1, 2, \dots, n \}\) without \(n\) is \(\binom{n-1}{k}\).
	      The number of \(k\)-subsets of \(\{ 1, 2, \dots, n \}\) that do contain \(n\) is \(\binom{n-1}{k-1}\) as we must pick the remaining \(k-1\) elements of this new subset.
	      So in total, \(\binom{n-1}{k-1} + \binom{n-1}{k}\) encapsulates both possibilities.
\end{enumerate}
This last point illustrates that Pascal's Triangle will give all the binomial coefficients since it perfectly encapsulates the relationship between a given element of the triangle with two elements from the previous row.
The exact proof follows from the other known properties of the binomial coefficients.

\subsection{Computing binomial coefficients}
\begin{proposition}
	\[
		\binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)(k-2)\cdots(1)}
	\]
\end{proposition}
\begin{proof}
	The number of ways to name a \(k\)-set is \(n(n-1)(n-2)\cdots(n-k+1)\) because there are \(n\) ways to choose a first element, \(n-1\) ways to choose a second element, and so on.
	We have overcounted the \(k\)-sets, though---there are \(k(k-1)(k-2)\cdots(1)\) ways to name a given \(k\)-set because you have \(k\) choices for the first element, \(k-1\) choices for the second element, and so on.
	Hence the number of \(k\)-sets in \(\{ 1, 2, \dots, n \}\) is the required result.
\end{proof}
Note that we can also write
\[
	\binom{n}{k} = \frac{n!}{k!(n-k)!}
\]
but this is a very unwieldy formula to use especially by hand, so will be rarely used.
Further, we can make asymptotic approximations using this formula, for example \(\binom{n}{3} \sim \frac{n^3}{6}\) for large \(n\).

\subsection{Binomial theorem}
\begin{theorem}
	For all \(a, b \in \mathbb R, n \in \mathbb N\), we have
	\[
		(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \dots + \binom{n}{n}b^n
	\]
\end{theorem}
\begin{proof}
	When we expand \((a+b)^n = (a+b)(a+b)\dots(a+b)\), we obtain terms of the form \(a^k b^{n-k}\).
	To get a single term of this form, we must choose \(k\) brackets for which to take the \(a\) value in the expansion, and the other \(n-k\) brackets will take the \(b\) value.
	The number of terms of the form \(a^k b^{n-k}\) for a fixed \(k\) is therefore the amount of ways of choosing \(k\) brackets out of a total of \(n\), which is \(\binom{n}{k}\).
	So
	\[
		(a+b)^n = \sum_{k=0}^n \binom{n}{k}a^k b^{n-k} = \sum_{k=0}^n \binom{n}{n-k}a^k b^{n-k}
	\]
\end{proof}
For example, we can tell that \((1+x)^n\) reduces to
\[
	1 + nx + \frac{1}{2}n(n-1)x^2 + \frac{1}{3!}n(n-1)(n-2)x^3 + \dots + nx^{n-1} + x^n
\]
So when \(x\) is small, a good approximation to \((1+x)^n\) is \(1 + nx\).

\subsection{Inclusion-exclusion theorem}
Given two finite sets \(A\), \(B\), we have
\[
	\abs{A \cup B} = \abs{A} + \abs{B} - \abs{A \cap B}
\]
For three sets, we have
\[
	\abs{A \cup B \cup C} = \abs{A} + \abs{B} + \abs{C} - \abs{A \cap B} - \abs{B \cap C} - \abs{C \cap A} + \abs{A \cap B \cap C}
\]
\begin{theorem}
	Let \(S_1, \dots, S_n\) be finite sets.
	Then,
	\[
		\abs{\bigcup_{S \in S_n} S} = \sum_{\abs{A} = 1}\abs{S_A} - \sum_{\abs{A} = 2}\abs{S_A} + \sum_{\abs{A} = 3}\abs{S_A} - \cdots
	\]
	where
	\[
		S_A = \bigcap_{i \in A}S_i
	\]
	and
	\[
		\sum_{\abs{A} = k}
	\]
	is a sum taken over all \(A \subseteq \{ 1, 2, \dots, n \}\) of size \(k\).
\end{theorem}
\begin{proof}
	Let \(x\) be an element of the left hand side.
	We wish to prove that \(x\) is counted exactly once on the right hand side.
	Without loss of generality, let us rename the sets that \(x\) belongs to as \(S_1, S_2, \dots, S_k\).

	Then the number of sets \(A\) with \(\abs{A} = 1\) such that \(x \in S_A\) is \(k\).
	The number of sets \(A\) with \(\abs{A} = 2\) such that \(x \in S_a\) is \(\binom{k}{2}\), since we must choose two of the sets \(S_1, \dots, S_k\), so there are \(\binom{k}{2}\) ways to do this.
	So in general, the amount of \(A\) with \(\abs{A} = r\) with \(x \in S_A\) is just \(\binom{k}{r}\).

	So the number of times \(x\) is counted on the right hand side is
	\[
		k - \binom{k}{2} + \binom{k}{3} - \dots + (-1)^{k+1}\binom{k}{k}
	\]
	But \((1 + (-1))^k\) by the binomial expansion is
	\[
		1 - \binom{k}{1} + \binom{k}{2} - \binom{k}{3} + \dots + (-1)^k\binom{k}{k}
	\]
	So the number of times \(x\) is counted on the right hand side is \(1 - (1 + (-1))^k = 1 - 0 = 1\).
\end{proof}
